Abstract. The parallel dynamics of the asymmetric extremely diluted Ashkin-Teller neural network is 
studied using signal-to-noise analysis techniques. Evolution equations for the order parameters are derived, 
both at zero and finite temperature. The retrieval properties of the network are discussed in terms of the 
four-spin coupling strength and the temperature. It is shown that the presence of a four-spin coupling 
enhances the retrieval quality. 

PACS. 64.60.Cn Order-disorder transformations; statistical mechanics of model systems - 75.10.Hk 
Classical spin models - 87.10.+e General, theoretical, and mathematical biophysics - 02.50.-r 
Probability theory, stochastic processes and statistics 



1 Introduction 

Recently, the equilibrium properties of the Ashkin-Teller neural network (ATNN) have been
studied. The neurons of the ATNN are described by two Ising spins of different types. This
allows the network to store and to retrieve pairs of patterns. Therefore, more complicated
information can be stored in the ATNN than in the Hopfield model, e.g., the fore- and
background of a picture. Every spin is connected to spins of the same type. In addition, the
neurons are connected to each other. The connections linking the neurons are four-spin
couplings, since they connect two pairs of spins, one pair per neuron. This allows the
network to retrieve both patterns of a pair simultaneously. One can think of the model as a
combination of two Hopfield models, each retrieving one of the patterns. The four-spin
coupling is then a connection between both models. The underlying idea is that the
simultaneous retrieval of a pair of patterns is easier than the independent retrieval of the
patterns in the pair.

There are various reasons for studying this model. The Ashkin-Teller spin glass is related to
disordered systems where the disorder evolves on a time scale that can be tuned. The
introduction of a neuron containing different types of spins is also neurobiologically
motivated by the fact that areas in the brain exist which react to two different kinds of
dependent stimuli in such a way that the response to particular combinations of these stimuli
is stronger than the response to others. Finally, in neuropsychological studies on amnesia,
it has become appreciated that memory is composed of multiple separate systems which can store
different types of information, e.g., information based on skills and information based on
specific facts or data.



The thermodynamic and retrieval properties of the ATNN have been studied using
replica-symmetric mean-field theory. In the present paper, we analyse the parallel dynamics of
the asymmetric extremely diluted version of the model. Both the way in which the system
evolves to its equilibrium configuration and the properties of the equilibrium configuration
itself are of interest. It is known that the dynamics in symmetric architectures, even in the
diluted case, is complicated in a non-trivial way because of correlations between the neuron
states. These correlations are caused by feedback loops and common ancestors. In contrast to
the Hopfield model, where the dynamics has been solved taking into account all the
correlations, the presence of two types of spins makes the analysis of the correlations in the
ATNN very complicated. The underlying reason is the existence of two sources of correlations.
First, feedback loops appear due to the two-spin interaction, as in the Hopfield model.
Second, the four-spin coupling causes correlations between spins of different type.
Therefore, in order to arrive at a first insight into the dynamics of the model, we limit
ourselves to its asymmetric extremely diluted version, where all correlations between the
neuron states are eliminated.

Using standard signal-to-noise analysis techniques, we find that the local field of the
asymmetrically diluted ATNN contains only a normally distributed part, besides the signal. As
observed already for the asymmetric diluted Hopfield model [12], the structure of the local
field does not change in time. This allows us to write down immediately the complete time
evolution of the main overlaps.

The rest of the paper is organised as follows. In the second section, we define the model as
an extension of the Hopfield model. We introduce parallel dynamics at arbitrary temperature,
and define the main overlaps (one for each type of spins) as macroscopic measures for the
retrieval quality. In Section 3, we use signal-to-noise analysis techniques in order to write
down the evolution equations at arbitrary time. From the evolution equations, the fixed-point
equations are obtained. These equations lead to the dynamical capacity-temperature diagram
presented in Section 4. Finally, we give some concluding remarks in Section 5.



2 The model 

The ATNN is defined as a neural network consisting of $N$ neurons. Each of the neuron states
is described by two spins with values $\sigma_i$ and $s_i$ $(i = 1, \ldots, N)$, both taken
from the discrete set $\{-1, +1\}$. For each type, the spins $i$ and $j$ are coupled by a
two-spin interaction $J^{(1)}_{ij}$ and $J^{(2)}_{ij}$ respectively, while the neurons $i$
and $j$ are coupled by a four-spin interaction $J^{(3)}_{ij}$. We assume no diagonal terms,
viz. $J^{(y)}_{ii} = 0$, $y = 1, 2, 3$.

A configuration of an ATNN consists of a $\sigma$-part and an $s$-part, viz.

$$ (\sigma(t) = \{\sigma_j(t)\},\ s(t) = \{s_j(t)\}); \qquad j = 1, \ldots, N . \tag{1} $$



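Given such a configuration, we define three types of local fields: two Hopfield-like local
fields which measure the incoming signal to the spins $\sigma_i$ and $s_i$, caused by the
spins of the same type

$$ h^{(1)}_{N,i}(\sigma) = \sum_{j=1}^{N} J^{(1)}_{ij}\,\sigma_j , \qquad
   h^{(2)}_{N,i}(s) = \sum_{j=1}^{N} J^{(2)}_{ij}\, s_j , \tag{2} $$

and, in addition, a local field which measures the incoming signal to neuron $i$, caused by
both spins of the other neurons

$$ h^{(3)}_{N,i}(\sigma, s) = \sum_{j=1}^{N} J^{(3)}_{ij}\,\sigma_j s_j . \tag{3} $$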



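In the sequel, we write the shorthand notation $h^{(x)}_{N,i}(t) = h^{(x)}_{N,i}(S_x(t))$,
$x = 1, 2$ with $S_1(t) = \sigma(t)$, $S_2(t) = s(t)$, and
$h^{(3)}_{N,i}(t) = h^{(3)}_{N,i}(\sigma(t), s(t))$. The configuration $(\sigma(0), s(0))$ is
chosen as input. At temperature $T = 1/\beta$, all neurons are updated in parallel according
to the transition probability

$$ \Pr(\sigma_i(t+1) = \sigma \mid \sigma(t), s(t))
   = \frac{1}{2}\left[1 + \tanh \beta\sigma \left(h^{(1)}_{N,i}(t) + s_i(t)\, h^{(3)}_{N,i}(t)\right)\right] , $$
$$ \Pr(s_i(t+1) = s \mid \sigma(t), s(t))
   = \frac{1}{2}\left[1 + \tanh \beta s \left(h^{(2)}_{N,i}(t) + \sigma_i(t)\, h^{(3)}_{N,i}(t)\right)\right] . \tag{4} $$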



We assume hereby that both types of spins exhibit the same degree of stochasticity. At zero
temperature, this dynamics becomes deterministic and is given by

$$ S_{x,i}(t+1) = \mathrm{sign}\left[h^{(x)}_{N,i}(t) + S_{\bar{x},i}(t)\, h^{(3)}_{N,i}(t)\right] \tag{5} $$

with $x, \bar{x} = 1, 2$ and $x \neq \bar{x}$. The $\sigma$-spins receive at each time input
from the $s$-spins and vice versa due to the term containing $h^{(3)}_{N,i}(t)$.
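The update rules (4) and (5) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names (`local_fields`, `parallel_step`) and the tie-break `sign(0) = +1` are choices made here, not part of the model definition, and the couplings are passed in as plain nested lists.

```python
import math
import random

def local_fields(J1, J2, J3, sigma, s):
    """Local fields of eqs. (2)-(3) for every neuron i."""
    n = len(sigma)
    h1 = [sum(J1[i][j] * sigma[j] for j in range(n)) for i in range(n)]
    h2 = [sum(J2[i][j] * s[j] for j in range(n)) for i in range(n)]
    h3 = [sum(J3[i][j] * sigma[j] * s[j] for j in range(n)) for i in range(n)]
    return h1, h2, h3

def _draw(field, beta, rng):
    """Sample one spin: deterministic if beta is None, else Glauber-like as in eq. (4)."""
    if beta is None:
        return 1 if field >= 0 else -1  # sign(0) = +1 is an arbitrary tie-break
    p_up = 0.5 * (1.0 + math.tanh(beta * field))
    return 1 if rng.random() < p_up else -1

def parallel_step(J1, J2, J3, sigma, s, beta=None, rng=random):
    """Update all sigma- and s-spins simultaneously from the fields at time t."""
    h1, h2, h3 = local_fields(J1, J2, J3, sigma, s)
    new_sigma, new_s = [], []
    for i in range(len(sigma)):
        # Each spin feels its own two-spin field plus the four-spin field
        # multiplied by the partner spin of the same neuron.
        new_sigma.append(_draw(h1[i] + s[i] * h3[i], beta, rng))
        new_s.append(_draw(h2[i] + sigma[i] * h3[i], beta, rng))
    return new_sigma, new_s
```

Calling `parallel_step(J1, J2, J3, sigma, s)` applies the zero-temperature rule (5); passing `beta=1.0 / T` samples from the transition probability (4) instead.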

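The aim of the network is to store simultaneously $p_1$ patterns $\{\xi_i^\mu\}$,
$\mu = 1, \ldots, p_1$ in the $\sigma$-part of the network and $p_2$ patterns
$\{\eta_i^\mu\}$, $\mu = 1, \ldots, p_2$ in the $s$-part. All components of the patterns
$\xi_i^\mu$ and $\eta_i^\mu$ are i.i.d.r.v. taken from $\{-1, +1\}$ with zero mean
$\langle \xi_i^\mu \rangle = 0 = \langle \eta_i^\mu \rangle$ and independent type by type,
$\langle \xi_i^\mu \eta_j^\nu \rangle = 0$ $(i, j = 1, \ldots, N)$. In order to store these
embedded patterns, the two-spin couplings are chosen according to the Hebb rule

$$ J^{(1)}_{ij} = \frac{J_1}{N} \sum_{\mu=1}^{p_1} \xi_i^\mu \xi_j^\mu , \qquad
   J^{(2)}_{ij} = \frac{J_2}{N} \sum_{\mu=1}^{p_2} \eta_i^\mu \eta_j^\mu . \tag{6} $$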



The four-spin interaction is, also in analogy to the Hebb rule, defined as

$$ J^{(3)}_{ij} = \frac{J_3}{N} \sum_{\mu=1}^{p_3} \gamma_i^\mu \gamma_j^\mu . \tag{7} $$



Under the assumption of independent embedded patterns $\xi_i^\mu$ and $\eta_i^\mu$, we
consider the following form

$$ \gamma_i^\mu = \xi_i^\mu \eta_i^\mu . \tag{8} $$

The patterns $\{\gamma_i^\mu\}$, $\mu = 1, \ldots, p_3$ are then a set of i.i.d.r.v. taken
from $\{-1, +1\}$ with zero mean. In the literature, this choice of patterns is called the
linked case.
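The construction of the couplings (6)-(8) for the linked case can be sketched as follows. This is a minimal illustration, assuming equal numbers of patterns of each type; the helper names `make_linked_patterns` and `hebb_couplings` are hypothetical and introduced only for this example.

```python
import random

def make_linked_patterns(n, p, rng):
    """Draw p random +-1 patterns xi and eta of size n and link them: gamma = xi * eta."""
    xi = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    eta = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    gamma = [[x * e for x, e in zip(xi[mu], eta[mu])] for mu in range(p)]
    return xi, eta, gamma

def hebb_couplings(patterns, strength):
    """Hebb rule J_ij = (strength / N) * sum_mu pattern_i^mu pattern_j^mu, with J_ii = 0."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no diagonal terms
                J[i][j] = strength / n * sum(pat[i] * pat[j] for pat in patterns)
    return J
```

For example, with `xi, eta, gamma = make_linked_patterns(100, 5, random.Random(0))`, the three coupling matrices are `hebb_couplings(xi, 1.0)`, `hebb_couplings(eta, 1.0)` and `hebb_couplings(gamma, J)`, where `J` plays the role of the relative four-spin strength.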

The variables $J_y$, $y = 1, 2, 3$ are constants scaling the relative importance of all types
of couplings. We choose in the sequel $J_1 = J_2$ since we want both types of spins
interchangeable for simplicity. The relative scale of the temperature and coupling strengths
is fixed by choosing $J_1 = 1$. Finally, we define $J = J_3 / J_1$ such that this quantity
measures the relative strength of the four-spin couplings with respect to the two-spin
couplings. In the limit $J \to 0$ the ATNN becomes, at least in structure, the equivalent of
two independent Hopfield models since $h^{(3)}_{N,i}(t) = 0$ at all times. We use the
temperature $T$ and the relative coupling strength $J$ as independent variables.

In the sequel, we take the interactions asymmetric extremely diluted [12]

$$ J^{(y)}_{ij} \to \frac{N}{c_y}\, c^{(y)}_{ij}\, J^{(y)}_{ij} , \qquad y = 1, 2, 3 \tag{9} $$

with $c_y > 0$ and
$\Pr\{c^{(y)}_{ij} = a\} = (1 - c_y/N)\,\delta_{a,0} + (c_y/N)\,\delta_{a,1}$. The variables
$\{c^{(y)}_{ij}\}$ are independent for each pair $(i, j)$, representing both the asymmetry
and the dilution. The diagonal terms are excluded, $c^{(y)}_{ii} = 0$. The structure of the
architecture of the network then becomes a directed tree with an average number of incoming
and outgoing connections (type by type) both equal to $c_y$. It is assumed that
$c_y \ll N$. The system is first diluted by taking the limit $N \to \infty$. Afterwards, the
number of incoming signals per site is made extensive by taking the limit $c_y \to \infty$.
The probability to have feedback in the system is now zero and the correlations are treelike.
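The dilution step can be sketched in a few lines. The sketch assumes that each dilution variable is drawn independently with probability $c/N$ of being 1, separately for $(i, j)$ and $(j, i)$, and that surviving couplings are rescaled by $N/c$ so that the signal term stays of order one; the names `dilution_mask` and `dilute` are hypothetical.

```python
import random

def dilution_mask(n, c, rng):
    """Independent c_ij in {0, 1} with Pr(c_ij = 1) = c / n; c_ij and c_ji are drawn separately."""
    prob = c / n
    return [[1 if (i != j and rng.random() < prob) else 0 for j in range(n)]
            for i in range(n)]

def dilute(J, mask, c):
    """Keep only the couplings selected by the mask and rescale them by n / c."""
    n = len(J)
    return [[(n / c) * mask[i][j] * J[i][j] for j in range(n)] for i in range(n)]
```

Because the mask is not symmetric, a connection from $j$ to $i$ generally has no counterpart from $i$ to $j$, which is what removes feedback loops in the limit described above.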

In principle, all couplings can be diluted independently (viz. $c_1 \neq c_2 \neq c_3$). For
convenience, however, we dilute them in the same way,

$$ c^{(1)}_{ij} = c^{(2)}_{ij} = c^{(3)}_{ij} = c_{ij} , \qquad c_1 = c_2 = c_3 = c . \tag{10} $$



This means that both spins of a neuron get information from the same neurons and that the
dynamics of both types of spins can be treated analogously. Since the number of embedded
patterns is of the same order as the number of connections a spin has with spins of the same
type, we have $p_1 = p_2 = p_3 = \alpha c = p$. In what follows, we write, for simplicity,
$J^{(y)}_{ij}$ for the diluted couplings.

At this point, we note that the capacity of the ATNN is defined as the ratio of the number of
patterns stored in the network and the number of couplings to a neuron. In this model, where
we want to store $2p$ patterns $\{\xi^\mu, \eta^\mu\}$, all neurons have on average $3c$
links to the other neurons: $2c$ two-spin couplings and $c$ four-spin couplings. Therefore,
the capacity of the ATNN equals

$$ \alpha_{\rm ATNN} = \frac{2p}{3c} = \frac{2}{3}\,\alpha . \tag{11} $$



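The retrieval quality of the model is measured by the Hamming distance between the microscopic
state of the network and the stored patterns

$$ d(\psi_x^\mu, S_x(t)) \equiv \frac{1}{N} \sum_{i=1}^{N} \left[\psi^\mu_{x,i} - S_{x,i}(t)\right]^2 \tag{12} $$
$$ \phantom{d(\psi_x^\mu, S_x(t))} = 2\left[1 - m^\mu_{x,N}(t)\right] \tag{13} $$

where $\mu = 1, \ldots, p$, $x = 1, 2$, $\psi_1^\mu = \xi^\mu$ and $\psi_2^\mu = \eta^\mu$.
This naturally introduces the main overlaps

$$ m^\mu_{x,N}(t) = \frac{1}{N} \sum_{i=1}^{N} \psi^\mu_{x,i}\, S_{x,i}(t) , \qquad \mu = 1, \ldots, p . \tag{14} $$

In the diluted model the sum in (14) has to be taken over the tree-like structure, viz.
$\frac{1}{N}\sum_{i=1}^{N} \to \frac{1}{c}\sum_{i=1}^{N} c_{ji}$. The expression for the main
overlap (14) then reads

$$ m^\mu_{x,c,N}(t) = \frac{1}{c} \sum_{i=1}^{N} c_{ji}\, \psi^\mu_{x,i}\, S_{x,i}(t) . \tag{15} $$

We remark that both expressions (14) and (15) become equal in the thermodynamic limit
$c, N \to \infty$.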
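The relation between the Hamming distance and the main overlap can be checked numerically. The sketch below assumes the normalisation $1/N$ of the squared differences, for which $d = 2(1 - m)$ holds exactly for $\pm 1$ variables; the function names, and the argument `mask_row` holding the dilution variables feeding one site, are hypothetical.

```python
def overlap(pattern, spins):
    """Main overlap m = (1/N) * sum_i pattern_i * spin_i."""
    return sum(p * s for p, s in zip(pattern, spins)) / len(spins)

def hamming_distance(pattern, spins):
    """Distance d = (1/N) * sum_i (pattern_i - spin_i)^2, equal to 2 * (1 - m)."""
    return sum((p - s) ** 2 for p, s in zip(pattern, spins)) / len(spins)

def diluted_overlap(pattern, spins, mask_row, c):
    """Overlap restricted to the sites selected by the dilution variables of one site."""
    return sum(k * p * s for k, p, s in zip(mask_row, pattern, spins)) / c
```

For instance, a state that agrees with the pattern on half of the sites has `overlap(...) == 0.0` and `hamming_distance(...) == 2.0`.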



3 Dynamics 



In this section we construct a set of recursion equations for the main overlap order
parameters. We use hereby signal-to-noise techniques (see, e.g., ref. [13]). Finally, we
write down the fixed-point equations.

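Suppose an initial spin configuration $(\sigma(0), s(0))$. The configurations
$S_x(0) = \{S_{x,i}(0)\}$, $i = 1, \ldots, N$ are collections of i.i.d.r.v. with mean
$\langle S_{x,i}(0) \rangle = 0$ and variance $\langle (S_{x,i}(0))^2 \rangle = 1$. Spins of
different types are uncorrelated, $\langle \sigma_i(0)\, s_j(0) \rangle = 0$
$(i, j = 1, \ldots, N)$. Both types are correlated with only one of the stored patterns,
e.g., the first one

$$ \langle \psi^\mu_{x,i}\, S_{x,j}(0) \rangle = \delta_{i,j}\, \delta_{\mu,1}\, m^1_{x,0} . \tag{16} $$

The site by site independence of spins and patterns implies by the law of large numbers (LLN)
that we get for the main overlaps

$$ m^1_x(0) \equiv \lim_{c,N \to \infty} m^1_{x,c,N}(0) = \langle \psi^1_{x,i}\, S_{x,i}(0) \rangle = m^1_{x,0} . \tag{17} $$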

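We now want to study how the main overlaps evolve under the parallel dynamics specified
before. For a general time step and at $T = 0$, we find from (15) and the LLN in the limit
$c, N \to \infty$

$$ m^1_x(t+1) = \left\langle\!\left\langle \psi^1_{x,i}\, \mathrm{sign}\left(h^{(x)}_i(t) + S_{\bar{x},i}(t)\, h^{(3)}_i(t)\right) \right\rangle\!\right\rangle , $$
$$ m^1_3(t+1) = \left\langle\!\left\langle \xi_i^1 \eta_i^1\, \mathrm{sign}\left(h^{(1)}_i(t) + s_i(t)\, h^{(3)}_i(t)\right)
   \times \mathrm{sign}\left(h^{(2)}_i(t) + \sigma_i(t)\, h^{(3)}_i(t)\right) \right\rangle\!\right\rangle \tag{18} $$

where $x, \bar{x} = 1, 2$; $x \neq \bar{x}$. The average $\langle\langle \cdot \rangle\rangle$
denotes the average both over the distribution of the embedded patterns $\{\xi_i^\mu\}$ and
$\{\eta_i^\mu\}$ and the initial configuration $\{\sigma_i(0), s_i(0)\}$. The average over the
latter is hidden in an average over the local fields through the updating rule (5).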

The equations (18) show that the knowledge of the distribution of the local field at
successive time steps is sufficient in order to find the evolution equations for the order
parameters. We start with calculating the distribution of the local field of the
$\sigma$-spins at $t = 0$. Using the definitions (2) and applying the signal-to-noise
analysis, we have

$$ h^{(1)}_{N,i}(0) = \xi_i^1\, \frac{1}{c} \sum_{j=1}^{N} c_{ij}\, \xi_j^1 \sigma_j(0)
   + \frac{1}{c} \sum_{\mu=2}^{p} \sum_{j=1}^{N} c_{ij}\, \xi_i^\mu \xi_j^\mu \sigma_j(0) . \tag{19} $$

The signal term, i.e., the first term on the r.h.s. of (19), is nothing but the main overlap
(15) multiplied by $\xi_i^1$. In the noise part, i.e., the second term on the r.h.s., all
terms are uncorrelated by construction such that we can apply the central limit theorem (CLT)
to find

$$ \lim_{c,N \to \infty} \frac{1}{c} \sum_{\mu=2}^{p} \sum_{j=1}^{N} c_{ij}\, \xi_i^\mu \xi_j^\mu \sigma_j(0)
   = \sqrt{\alpha}\; \mathcal{N}(0, 1) \tag{20} $$

where $\mathcal{N}(0, 1)$ represents a Gaussian random variable with mean 0 and variance 1.
Therefore, in the limit $c, N \to \infty$, the local field at $t = 0$ is the sum of two
independent random variables

$$ h^{(1)}_i(0) \equiv \lim_{c,N \to \infty} h^{(1)}_{N,i}(0) = \xi_i^1\, m^1_1(0) + \sqrt{\alpha}\, z_1(0) \tag{21} $$

with $z_1(0) \sim \mathcal{N}(0, 1)$. In an analogous way, we find for the local field of the
$s$-spins

$$ h^{(2)}_i(0) \equiv \lim_{c,N \to \infty} h^{(2)}_{N,i}(0) = \eta_i^1\, m^1_2(0) + \sqrt{\alpha}\, z_2(0) \tag{22} $$

with $z_2(0) \sim \mathcal{N}(0, 1)$.
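As a consistency check on the noise width $\sqrt{\alpha}$, the variance of the noise term of the $\sigma$-field can be worked out directly. The short computation below is a sketch that assumes diluted Hebbian couplings with $\Pr\{c_{ij} = 1\} = c/N$ and uses only the independence of patterns, dilution variables and initial spins; cross terms vanish because the patterns with $\mu \geq 2$ have zero mean and are uncorrelated with the initial configuration.

```latex
\begin{align*}
\mathrm{Var}\Big[\frac{1}{c}\sum_{\mu=2}^{p}\sum_{j\neq i} c_{ij}\,\xi_i^\mu \xi_j^\mu \sigma_j(0)\Big]
 &= \frac{1}{c^2}\sum_{\mu=2}^{p}\sum_{j\neq i}
    \big\langle c_{ij}^2 \big\rangle\,
    \big\langle (\xi_i^\mu \xi_j^\mu \sigma_j(0))^2 \big\rangle \\
 &= \frac{1}{c^2}\,(p-1)\,(N-1)\,\frac{c}{N}
 \;\longrightarrow\; \frac{p}{c} = \alpha
 \qquad (c, N \to \infty).
\end{align*}
```

Here $c_{ij}^2 = c_{ij}$ and every product of $\pm 1$ variables squares to one, so only the average number of incoming connections and the number of non-condensed patterns enter.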

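As in the local fields of the spins (19), we separate in the local field (3) the terms
containing the first pattern $\gamma^1$ from the rest

$$ h^{(3)}_{N,i}(0) = J\, \gamma_i^1\, \frac{1}{c} \sum_{j=1}^{N} c_{ij}\, \gamma_j^1 \sigma_j(0)\, s_j(0)
   + J\, \frac{1}{c} \sum_{\mu=2}^{p} \sum_{j=1}^{N} c_{ij}\, \gamma_i^\mu \gamma_j^\mu \sigma_j(0)\, s_j(0) . \tag{23} $$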

In analogy with before, we call the first term the signal and the last term the noise.
Applying the LLN to the signal term, we get

$$ \lim_{c,N \to \infty} \frac{1}{c} \sum_{j=1}^{N} c_{ij}\, \gamma_j^1 \sigma_j(0)\, s_j(0)
   = \langle \gamma_j^1 \sigma_j(0)\, s_j(0) \rangle . \tag{24} $$

Since this term strongly resembles the main overlaps (15), we also call it an overlap and
denote it by $m^1_3(0)$. In general, this overlap is defined by

$$ m^\mu_{3,c,N}(t) = \frac{1}{c} \sum_{j=1}^{N} c_{ij}\, \gamma_j^\mu \sigma_j(t)\, s_j(t) , \qquad \mu = 1, \ldots, p . \tag{25} $$



In the sequel, we will treat this parameter at the same level as the other overlaps (15). For
the linked choice of patterns $\gamma_i^\mu = \xi_i^\mu \eta_i^\mu$, it follows from (16) that
$\langle \gamma_i^\mu \sigma_j(0)\, s_k(0) \rangle = \delta_{ij}\, \delta_{ik}\, \delta_{\mu,1}\, m^1_{1,0}\, m^1_{2,0}$
since the initial spin configurations $\sigma_j(0)$ and $s_j(0)$ are independent. Therefore,
we have in the thermodynamic limit and at $t = 0$

$$ m^1_3(0) \equiv \lim_{c,N \to \infty} m^1_{3,c,N}(0) = m^1_1(0)\, m^1_2(0) . \tag{26} $$

Following the same line of arguments as before, the noise term converges again to a Gaussian
random variable such that

$$ h^{(3)}_i(0) \equiv \lim_{c,N \to \infty} h^{(3)}_{N,i}(0) = J\, \xi_i^1 \eta_i^1\, m^1_3(0) + J \sqrt{\alpha}\, z_3(0) \tag{27} $$

with $z_3(0) \sim \mathcal{N}(0, 1)$. This finishes the calculation of the local fields at
time $t = 0$.

At a general time $t$, the local fields still consist of a signal term, proportional to the
main overlap, and a Gaussian distributed noise part. This is due to the extreme dilution,
which eliminates all common ancestors in the dynamics. Therefore, all variables
$\{X_j^\mu = \psi^\mu_{x,i}\, \psi^\mu_{x,j}\, S_{x,j}(t) \mid j = 1, \ldots, N;\ \mu = 1, \ldots, p\}$
are a set of i.i.d.r.v. and we can apply the CLT in the same way as in eq. (19). So, we find
for the distribution of the local fields a set of equations with the same structure as the
eqs. (21), (22) and (27), viz.

$$ h^{(x)}_i(t) = \psi^1_{x,i}\, m^1_x(t) + \sqrt{\alpha}\, z_x(t) ; \qquad z_x(t) \sim \mathcal{N}(0, 1) , $$
$$ h^{(3)}_i(t) = J\, \xi_i^1 \eta_i^1\, m^1_3(t) + J \sqrt{\alpha}\, z_3(t) ; \qquad z_3(t) \sim \mathcal{N}(0, 1) . \tag{28} $$

The three Gaussian variables $z_y(t)$ are uncorrelated.

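Using the distributions of the local fields (28) and remarking that the joint probability of
$s_i(t)$, $\sigma_i(t)$, $\xi_i^1$ and $\eta_i^1$ is obtained from the overlaps $m^1_y(t)$,
$y = 1, 2, 3$, we arrive immediately at the evolution equations for the order parameters at
zero temperature

$$ m^1_x(t+1) = \sum_{\sigma = \pm 1} \frac{1}{2}\left(1 + \sigma\, m^1_{\bar{x}}(t)\right)
   \mathrm{Erf}\left(\frac{m^1_x(t) + \sigma J\, m^1_3(t)}{\sqrt{2\alpha(1 + J^2)}}\right) , $$
$$ m^1_3(t+1) = \frac{1}{4} \sum_{\sigma,\tau = \pm 1}
   \left(\sigma\tau + \sigma\, m^1_1(t) + \tau\, m^1_2(t) + m^1_3(t)\right)
   \int Dz\; \mathrm{Erf}\left(\frac{\sigma\, m^1_1(t) + J\, m^1_3(t) + J\sqrt{\alpha}\, z}{\sqrt{2\alpha}}\right)
   \times \mathrm{Erf}\left(\frac{\tau\, m^1_2(t) + J\, m^1_3(t) + J\sqrt{\alpha}\, z}{\sqrt{2\alpha}}\right) \tag{29} $$

where $x, \bar{x} = 1, 2$, $x \neq \bar{x}$, and $Dz = dz\, e^{-z^2/2}/\sqrt{2\pi}$ is the
Gaussian measure.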
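The zero-temperature recursion can be iterated numerically. The sketch below is an illustration under stated assumptions, not the authors' numerical procedure: it assumes that the $m_x$ update is a sum over $\sigma = \pm 1$ of $\frac{1}{2}(1 + \sigma m_{\bar{x}})\,\mathrm{Erf}\big((m_x + \sigma J m_3)/\sqrt{2\alpha(1+J^2)}\big)$ and that the $m_3$ update is a Gaussian average of a product of two error functions weighted by $\frac{1}{4}(\sigma\tau + \sigma m_1 + \tau m_2 + m_3)$. The Gaussian average is approximated with a plain trapezoidal rule, and all function names are hypothetical.

```python
import math

def gauss_average(f, half_width=8.0, steps=401):
    """Trapezoidal approximation of the Gaussian average of f over a standard normal z."""
    dz = 2.0 * half_width / (steps - 1)
    total = 0.0
    for k in range(steps):
        z = -half_width + k * dz
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def step_zero_temperature(m1, m2, m3, alpha, J):
    """One step of the T = 0 map for the three overlaps (requires alpha > 0)."""
    s1 = math.sqrt(2.0 * alpha * (1.0 + J * J))
    s2 = math.sqrt(2.0 * alpha)
    new_m1 = sum(0.5 * (1 + sg * m2) * math.erf((m1 + sg * J * m3) / s1) for sg in (-1, 1))
    new_m2 = sum(0.5 * (1 + sg * m1) * math.erf((m2 + sg * J * m3) / s1) for sg in (-1, 1))
    new_m3 = 0.0
    for sg in (-1, 1):
        for tau in (-1, 1):
            weight = 0.25 * (sg * tau + sg * m1 + tau * m2 + m3)

            def integrand(z, sg=sg, tau=tau):
                shift = J * m3 + J * math.sqrt(alpha) * z  # shared four-spin contribution
                return math.erf((sg * m1 + shift) / s2) * math.erf((tau * m2 + shift) / s2)

            new_m3 += weight * gauss_average(integrand)
    return new_m1, new_m2, new_m3

def iterate_zero_temperature(m0, alpha, J, n_steps=100):
    """Iterate from m1 = m2 = m0 and m3 = m0**2, the initial condition of eq. (26)."""
    m1, m2, m3 = m0, m0, m0 * m0
    for _ in range(n_steps):
        m1, m2, m3 = step_zero_temperature(m1, m2, m3, alpha, J)
    return m1, m2, m3
```

For $J = 0$ the map for $m_x$ reduces to $m \mapsto \mathrm{Erf}(m/\sqrt{2\alpha})$, so iterating with $\alpha$ well below $2/\pi$ ends near $m = 1$, while $\alpha$ above $2/\pi$ flows to $m = 0$.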

After some time $t$ the dynamics reaches the point where the spins macroscopically
equilibrate. This means that the main overlap becomes stationary, viz.
$m^1_y(t+1) = m^1_y(t)$. Since the expressions for the local fields do not change their
structure, the corresponding fixed-point equations are easily obtained from eq. (29) by
replacing the time-dependent quantities by their equilibrium value
$m_y = \lim_{t \to \infty} m_y(t)$. In the limit $J \to 0$, the equations (29) are consistent
with the evolution equations of [12].

At non-zero temperature $T = 1/\beta$ the main overlaps at time $t$ read

$$ m^1_x(t+1) = \left\langle\!\left\langle \psi^1_{x,i}\, \langle S_{x,i}(t+1) \rangle_\beta \right\rangle\!\right\rangle , \qquad
   m^1_3(t+1) = \left\langle\!\left\langle \xi_i^1 \eta_i^1\, \langle \sigma_i(t+1) \rangle_\beta\, \langle s_i(t+1) \rangle_\beta \right\rangle\!\right\rangle . \tag{30} $$

The thermal averages are defined by the updating rule (4) and can be written as

$$ \langle S_{x,i}(t+1) \rangle_\beta = \tanh\left[\beta\left(h^{(x)}_i(t) + S_{\bar{x},i}(t)\, h^{(3)}_i(t)\right)\right] . \tag{31} $$

The stochasticity in the dynamics does not modify the local spin correlations when compared
with the deterministic dynamics. Therefore, the local fields are still distributed according
to (28) and we get for the order parameters

$$ m^1_x(t+1) = \sum_{\sigma = \pm 1} \frac{1}{2}\left(1 + \sigma\, m^1_{\bar{x}}(t)\right)
   \int Dy\; \tanh \beta\left(m^1_x(t) + \sigma J\, m^1_3(t) + \sqrt{\alpha(1 + J^2)}\; y\right) , $$
$$ m^1_3(t+1) = \frac{1}{4} \sum_{\sigma,\tau = \pm 1}
   \left(\sigma\tau + \sigma\, m^1_1(t) + \tau\, m^1_2(t) + m^1_3(t)\right)
   \int Dz \int Dx\; \tanh \beta\left(\sigma\, m^1_1(t) + J\, m^1_3(t) + \sqrt{\alpha}\, x + J\sqrt{\alpha}\, z\right)
   \times \int Dy\; \tanh \beta\left(\tau\, m^1_2(t) + J\, m^1_3(t) + \sqrt{\alpha}\, y + J\sqrt{\alpha}\, z\right) . \tag{32} $$

The fixed-point equations are read off from (32) by using again
$m_y = \lim_{t \to \infty} m_y(t)$, $y = 1, 2, 3$. In the zero-temperature limit
$\beta \to \infty$, the equations above reduce to (29). Moreover, in the limit $J \to 0$, they
are consistent with the ones obtained for the asymmetric extremely diluted Hopfield model in
[12].

Fig. 1. Capacity-temperature diagram for the ATNN for J = 0.0 (full line), J = 0.3 (dotted
line), J = 1.0 (dashed line), J = 2.0 (long dashed line) and J = 3.0 (dot-dashed line).
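The finite-temperature recursion can be iterated in the same spirit. The sketch below restricts itself to the symmetric case $m_1 = m_2 = m$ and assumes the structure of the evolution equations described above (a Gaussian average of $\tanh$ for $m$, and a nested Gaussian average of a product of two $\tanh$ averages for $m_3$). The quadrature and all names are choices made here for illustration only.

```python
import math

def gauss_average(f, half_width=6.0, steps=81):
    """Trapezoidal approximation of the Gaussian average of f over a standard normal."""
    dz = 2.0 * half_width / (steps - 1)
    total = 0.0
    for k in range(steps):
        z = -half_width + k * dz
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def step_finite_temperature(m, m3, alpha, J, T):
    """One step of the finite-temperature map in the symmetric case m1 = m2 = m."""
    beta = 1.0 / T
    root_alpha = math.sqrt(alpha)
    width = math.sqrt(alpha * (1.0 + J * J))
    new_m = sum(
        0.5 * (1 + sg * m)
        * gauss_average(lambda y, sg=sg: math.tanh(beta * (m + sg * J * m3 + width * y)))
        for sg in (-1, 1)
    )

    def inner(a, z):
        # Average over the two-spin noise at fixed four-spin noise z.
        return gauss_average(
            lambda x: math.tanh(beta * (a + J * m3 + root_alpha * x + J * root_alpha * z))
        )

    def outer(z):
        vals = {1: inner(m, z), -1: inner(-m, z)}
        return sum(
            0.25 * (sg * tau + (sg + tau) * m + m3) * vals[sg] * vals[tau]
            for sg in (-1, 1) for tau in (-1, 1)
        )

    return new_m, gauss_average(outer)

def iterate_finite_temperature(m0, alpha, J, T, n_steps=60):
    """Iterate from m = m0 and m3 = m0**2."""
    m, m3 = m0, m0 * m0
    for _ in range(n_steps):
        m, m3 = step_finite_temperature(m, m3, alpha, J, T)
    return m, m3
```

At $\alpha = 0$ and $J = 0$ the map reduces to $m \mapsto \tanh(\beta m)$, so a non-zero fixed point exists only for $T < 1$, which is a convenient sanity check before exploring $J > 0$.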

4 Results 

In this section, we discuss the numerical results for the ATNN obtained from the fixed-point
equations derived in the previous section. We present the capacity-temperature diagram
indicating the regions of retrieval as a function of the capacity $\alpha_{\rm ATNN}$ and the
temperature $T$, and some representative figures illustrating the main features of the model.

Due to the choice $J_1 = J_2 = 1$ and due to the condition that both types of spins have a
finite initial overlap with one condensed pattern, it turns out that the overlaps $m_1(t)$
and $m_2(t)$ always converge to the same equilibrium values, independent of the size of the
initial overlaps. (We omit the superscript 1 from here on.) Therefore we can restrict
ourselves to the case $m_1 = m_2$.



The resulting capacity-temperature diagram is presented in Fig. 1. First, we consider the
special case $J = 0$. At $T = 0$ a non-zero solution of the fixed-point equations exists as
long as $\alpha < 2/\pi$, indicating a transition from the retrieval to the non-retrieval
regime at $\alpha_{\rm ATNN} = 4/(3\pi)$. When the temperature increases, the critical
capacity decreases and becomes zero at $T = 1$. The resulting transition line in the
capacity-temperature diagram is similar to the one of the Hopfield model [12] up to a
rescaling of the capacity. This is not surprising since the structure of the equations of the
ATNN for $J = 0$ is consistent with those of the Hopfield model. The transition is always
continuous, and the main overlap decreases when $\alpha_{\rm ATNN}$ increases. This indicates
that the more patterns are embedded, the harder the retrieval and the worse the retrieval
quality.
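The value of the critical capacity at $J = 0$ and $T = 0$ can be recovered with a one-line linearisation. The sketch assumes that for $J = 0$ the zero-temperature fixed-point equation reduces to $m = \mathrm{Erf}\big(m/\sqrt{2\alpha}\big)$, and uses the small-argument expansion $\mathrm{Erf}(x) \simeq 2x/\sqrt{\pi}$, which is legitimate because the transition is continuous.

```latex
\begin{align*}
m = \mathrm{Erf}\!\left(\frac{m}{\sqrt{2\alpha}}\right)
  \simeq \sqrt{\frac{2}{\pi\alpha}}\; m
  \quad (m \to 0)
\quad &\Longrightarrow \quad
\sqrt{\frac{2}{\pi\alpha_c}} = 1
\quad \Longrightarrow \quad
\alpha_c = \frac{2}{\pi}, \\
\alpha_{\rm ATNN} = \frac{2}{3}\,\alpha_c &= \frac{4}{3\pi} \approx 0.424 .
\end{align*}
```

A non-zero solution branches off from $m = 0$ only when the slope of the right-hand side at the origin exceeds one, i.e. for $\alpha < 2/\pi$.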

When the four-spin coupling is non-zero, things be- 
come different. A larger four-spin coupling makes retrieval 
possible at higher temperature and decreases the critical 
capacity at low temperature. 

For finite loading ($\alpha_{\rm ATNN} = 0$), a continuous transition occurs for $J < 1/3$ at
$T = 1$. For larger $J$, the transition becomes discontinuous and the critical temperature
becomes larger. This indicates that a model with non-zero four-spin couplings can perform
better in the presence of noise in the dynamics. This agrees with previously reported
results. The overlap $m_1$ at the critical temperature increases up to $J = 2$, meaning that
the larger $J$ the better the retrieval quality. From $J = 2$ onwards, the increasing noise in
the dynamics at the transition line results in a slowly decreasing overlap $m_1$. We note
that we always observe $m_3 = (m_1)^2$ when $\alpha_{\rm ATNN} = 0$.
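The observation $m_3 = (m_1)^2$ at zero loading can be made plausible with a short calculation. The sketch below assumes the finite-temperature evolution equations at $\alpha = 0$ in the symmetric case $m_1 = m_2 = m$, where all Gaussian averages are trivial, and uses the abbreviations $A = \tanh\beta(m + J m_3)$ and $B = \tanh\beta(-m + J m_3)$, which are introduced here only for convenience.

```latex
\begin{align*}
m(t+1)   &= \tfrac{1}{2}(1+m)\,A - \tfrac{1}{2}(1-m)\,B , \\
m_3(t+1) &= \tfrac{1}{4}\Big[(1 + 2m + m_3)\,A^2 + 2\,(m_3 - 1)\,AB + (1 - 2m + m_3)\,B^2\Big] . \\
\intertext{Inserting $m_3 = m^2$ on the right-hand side gives}
m_3(t+1) &= \tfrac{1}{4}\Big[(1+m)\,A - (1-m)\,B\Big]^2 = \big(m(t+1)\big)^2 .
\end{align*}
```

Since the initial condition satisfies $m_3(0) = m_1(0)\,m_2(0)$, the relation is then preserved at every time step under these assumptions, and therefore also at the fixed point.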

At zero temperature ($T = 0$), increasing $J$ implies decreasing the critical capacity. The
transition is always first order, except for $J = 0$. The main overlap $m_1$ at the transition
line first increases with $J$, but starts to decrease from $J = 2.0$ onwards. At larger $J$,
it starts increasing again. The overlap $m_3$, however, always increases and becomes larger
than $m_1$ for $J > 4.2$. The critical capacity for $J = 1$ is equal to 0.3131, which is
higher than that of the fully connected ATNN ($\alpha_c = 0.1839$). This is consistent with
the results obtained by comparing the asymmetric extremely diluted with the fully connected
Hopfield model.

For non-zero temperature and infinite loading, the transition is partially continuous as long
as $J \in [0, 1/3]$. The larger $J$, the higher the temperature above which the transition is
continuous. As an example, we have drawn the value of the overlaps at the critical capacity
for $J = 0.3$ (Fig. 2a), where the transition is continuous for $T > 0.88$. When $J > 1/3$,
the transition is discontinuous for all temperatures. In Fig. 2b and Fig. 2c, we have drawn
the overlaps at the critical capacity for $J = 1$ and $J = 3$. The overlap $m_1$ exhibits a
maximum at $T = 0.32$ and $T = 0.86$ respectively, while $m_3$ is always decreasing.



5 Concluding remarks 

In this article, we have studied the parallel dynamics of the asymmetric extremely diluted
ATNN with linked patterns at arbitrary temperature. Because of the absence of correlations
between the neurons, we have found that the noise of the local field at all time steps is
normally distributed. Hence, the dynamical equations for the order parameters are obtained
immediately. Furthermore, the dynamical capacity-temperature diagram is discussed.

Fig. 2. Overlaps $m_1$ (full line) and $m_3$ (broken line) at the critical capacity as a
function of the temperature $T$, for a) J = 0.3, b) J = 1.0 and c) J = 3.0.

In the presence of the four-spin coupling term, the dynamics can exhibit more noise without
disturbing the retrieval process completely. Moreover, the transition from the retrieval to
the non-retrieval regime becomes first order. This implies that the Hamming distance becomes
smaller, even at the transition line. So in general, we can say that the four-spin coupling
term enhances the retrieval quality of the network.